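In many applications of aerial/satellite image analysis (remote sensing), generating the exact shapes of objects is a cumbersome task. Most remote sensing applications, such as counting objects, require only an estimate of each object's location. Locating object centroids in aerial/satellite images is therefore a simple solution for tasks where the exact shape of the object is not needed. This study accordingly assesses the feasibility of using deep neural networks to locate object centroids in satellite images. Our model is named Centroid-UNet and is based on the classic U-Net semantic segmentation architecture. We modified and adapted the U-Net semantic segmentation architecture into a centroid detection model while preserving the simplicity of the original model. We tested and evaluated the model on two case studies involving aerial/satellite images: building centroid detection and coconut tree centroid detection. Our evaluation results reach good accuracy compared to other methods while also offering simplicity. The code and models developed in this study are available in the Centroid-UNet GitHub repository: https://github.com/gicait/centroid-unet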
In this paper we explore the task of modeling (semi) structured object sequences; in particular, we focus on the problem of developing a structure-aware input representation for such sequences. We assume that each structured object is represented by a set of key-value pairs which encode the attributes of the structured object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two-part training results in better performance than a unified network with hierarchical encoding, as well as over other methods that use a {\em record-view} representation of the sequence \cite{de2021transformers4rec} or a simple {\em flattened} representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks, along with detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.
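The key-conditioned view described in this abstract can be illustrated with a small sketch. It covers only the data reshaping, not the TVM or KA models themselves. The function names `key_conditioned_sequences` and `flattened_view`, the `missing` placeholder, and the example records are hypothetical and chosen for illustration; the abstract does not specify how absent keys are handled.

```python
def key_conditioned_sequences(records, missing=None):
    """Pivot a time-ordered list of key-value records into one value
    sequence per key (assumption: absent keys are filled with `missing`)."""
    # Universe of keys: every key seen in any record, in sorted order.
    keys = sorted({key for record in records for key in record})
    # One sequence per key, aligned with the time order of the records.
    return {key: [record.get(key, missing) for record in records] for key in keys}


def flattened_view(records):
    """Baseline for comparison: one long sequence of (key, value) pairs."""
    return [(key, value) for record in records for key, value in record.items()]


# Hypothetical example: two structured objects observed over time.
records = [
    {"event": "login", "device": "phone"},
    {"event": "purchase", "amount": 30},
]
per_key = key_conditioned_sequences(records)
flat = flattened_view(records)
```

In this layout each key yields a sequence whose length equals the number of records, whereas the flattened baseline grows with the total number of key-value pairs.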
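Transformers combined with convolutional encoders have recently been used for hand gesture recognition (HGR) using micro-Doppler signatures. We propose a vision-transformer-based architecture for HGR with multi-antenna continuous-wave Doppler radar receivers. The proposed architecture consists of three modules: a convolutional encoder, an attention module with three transformer layers, and a multi-layer perceptron. The novel convolutional decoder helps to feed patches of larger size to the attention module for improved feature extraction. Experimental results obtained with a dataset corresponding to a two-antenna continuous-wave Doppler radar receiver (published by Skaria et al.) confirm that the proposed architecture achieves an accuracy of 98.3%, substantially surpassing the state of the art on the dataset used.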
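In a typical customer service chat scenario, customers contact a support center to ask for help or raise complaints, and human agents try to resolve the issues. In most cases, at the end of the conversation, agents are asked to write a short summary emphasizing the problem and the proposed solution, usually for the benefit of other agents who may have to deal with the same customer or issue. The goal of this paper is to advance the automation of this task. We introduce the first large-scale, high-quality customer care dialog summarization dataset, with close to 6500 human-annotated summaries. The data is based on real-world customer support dialogs and includes both extractive and abstractive summaries. We also introduce a new unsupervised extractive summarization method specific to dialogs.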